Tail paradox, partial identifiability, and influential priors in Bayesian branch length inference.

Authors

  • Bruce Rannala
  • Tianqi Zhu
  • Ziheng Yang
Abstract

Recent studies have observed that Bayesian analyses of sequence data sets using the program MrBayes sometimes generate extremely large branch lengths, with posterior credibility intervals for the tree length (sum of branch lengths) excluding the maximum likelihood estimates. Suggested explanations for this phenomenon include the existence of multiple local peaks in the posterior, lack of convergence of the chain in the tail of the posterior, mixing problems, and misspecified priors on branch lengths. Here, we analyze the behavior of Bayesian Markov chain Monte Carlo algorithms when the chain is in the tail of the posterior distribution and note that all these phenomena can occur. In Bayesian phylogenetics, the likelihood function approaches a constant instead of zero when the branch lengths increase to infinity. The flat tail of the likelihood can cause poor mixing and undue influence of the prior. We suggest that the main cause of the extreme branch length estimates produced in many Bayesian analyses is the poor choice of a default prior on branch lengths in current Bayesian phylogenetic programs. The default prior in MrBayes assigns independent and identical distributions to branch lengths, imposing strong (and unreasonable) assumptions about the tree length. The problem is exacerbated by the strong correlation between the branch lengths and parameters in models of variable rates among sites or among site partitions. To resolve the problem, we suggest two multivariate priors for the branch lengths (called compound Dirichlet priors) that are fairly diffuse and demonstrate their utility in the special case of branch length estimation on a star phylogeny. Our analysis highlights the need for careful thought in the specification of high-dimensional priors in Bayesian analyses.
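Two claims in the abstract can be checked numerically: that the likelihood flattens to a constant as branch lengths grow, and that independent, identically distributed branch length priors imply a tight prior on tree length. The sketch below is not the authors' analysis. It assumes the simplest possible setting, two sequences under the JC69 model, and an i.i.d. exponential prior with an illustrative rate of 10. The function names and the example counts (100 sites, 10 differences, 100 taxa) are hypothetical.

```python
import math


def jc69_log_likelihood(t, n_sites, n_diff):
    """Log-likelihood of branch length t (expected substitutions per site)
    for two aligned sequences under JC69, given n_diff differing sites.
    Requires t > 0 when n_diff > 0."""
    e = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * e   # P(same nucleotide after time t)
    p_diff = 0.25 - 0.25 * e   # P(one specific different nucleotide)
    # 0.25 is the stationary frequency of the nucleotide in the first sequence.
    return ((n_sites - n_diff) * math.log(0.25 * p_same)
            + n_diff * math.log(0.25 * p_diff))


def jc69_log_likelihood_limit(n_sites):
    """Limit of the log-likelihood as t -> infinity: every site pattern
    has probability 1/16, so the likelihood is a positive constant."""
    return n_sites * math.log(1.0 / 16.0)


def iid_exponential_tree_length_prior(n_taxa, rate):
    """Mean and standard deviation of the tree length implied by i.i.d.
    Exp(rate) priors on the 2s - 3 branches of an unrooted binary tree.
    The sum of the branches is Gamma(shape=2s - 3, rate=rate)."""
    n_branches = 2 * n_taxa - 3
    mean = n_branches / rate
    sd = math.sqrt(n_branches) / rate
    return mean, sd


if __name__ == "__main__":
    # Hypothetical data: 100 sites, 10 of which differ.
    for t in (0.1, 1.0, 10.0, 50.0):
        print(t, jc69_log_likelihood(t, 100, 10))
    print("limit", jc69_log_likelihood_limit(100))

    # Illustrative prior: rate 10 (mean branch length 0.1), 100 taxa.
    mean, sd = iid_exponential_tree_length_prior(100, 10.0)
    print("prior tree length mean", mean, "sd", sd)
```

Because the log-likelihood levels off at a finite value instead of falling to minus infinity, the prior alone determines the shape of the posterior in the tail. Under the i.i.d. prior, the coefficient of variation of the tree length shrinks as one over the square root of the number of branches, so the prior becomes more informative about tree length as taxa are added.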

Related resources

Robustness of compound Dirichlet priors for Bayesian inference of branch lengths.

We modified the phylogenetic program MrBayes 3.1.2 to incorporate the compound Dirichlet priors for branch lengths proposed recently by Rannala, Zhu, and Yang (2012. Tail paradox, partial identifiability and influential priors in Bayesian branch length inference. Mol. Biol. Evol. 29:325-335.) as a solution to the problem of branch-length overestimation in Bayesian phylogenetic inference. The co...

Consistency of Bayesian inference of resolved phylogenetic trees.

Bayesian inference is now a leading technique for reconstructing phylogenetic trees from aligned sequence data. In this short note, we formally show that the maximum posterior tree topology provides a statistically consistent estimate of a fully resolved evolutionary tree under a wide variety of conditions. This includes the inference of gene trees from aligned sequence data across the entire p...

Bayesian Inference for Spatial Beta Generalized Linear Mixed Models

In some applications, the response variable assumes values in the unit interval. The standard linear regression model is not appropriate for modelling this type of data because the normality assumption is not met. Alternatively, the beta regression model has been introduced to analyze such observations. A beta distribution represents a flexible density family on the (0, 1) interval that covers symm...

Effects of branch length uncertainty on Bayesian posterior probabilities for phylogenetic hypotheses.

In Bayesian phylogenetics, confidence in evolutionary relationships is expressed as posterior probability--the probability that a tree or clade is true given the data, evolutionary model, and prior assumptions about model parameters. Model parameters, such as branch lengths, are never known in advance; Bayesian methods incorporate this uncertainty by integrating over a range of plausible values...

Journal:
  • Molecular Biology and Evolution

Volume 29, Issue 1

Pages 325-335

Publication date: 2012